New implementation of CMS_WCHARM_13TEV_WPWM-TOT-UNNORM #2244

achiefa · 2024-12-09T17:06:16Z

This PR implements CMS_WCHARM_13TEV_WPWM-TOT-UNNORM in the new format.

General comments

This dataset delivers the differential distribution as a function of the absolute rapidity of the lepton pair. Each data point is accompanied by a (symmetric) statistical uncertainty and an (asymmetric) systematic uncertainty. The latter is the sum in quadrature of the different sources of uncertainty. The breakdown of these systematic sources is not provided in the HepData format, but it is given in Table 1 of the paper.
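As a small illustration of what "sum in quadrature" means here, the sketch below combines a handful of independent sources into a single upper and lower systematic uncertainty. The function name and the numbers are hypothetical; they are not taken from Table 1 of the paper or from the filter in this PR.

```python
import math


def total_systematic(sources):
    """Combine independent uncertainty sources in quadrature."""
    return math.sqrt(sum(s * s for s in sources))


# Hypothetical magnitudes of the individual sources for one bin
# (placeholders, not values from the paper).
upper_sources = [3.0, 4.0]
lower_sources = [6.0, 8.0]

sys_plus = total_systematic(upper_sources)   # upper systematic uncertainty
sys_minus = total_systematic(lower_sources)  # lower systematic uncertainty
```

Because the upper and lower totals generally differ, the resulting systematic uncertainty is asymmetric even when each source is treated independently.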

The legacy version has the variant sys_10, which should not be implemented because it was meant to account for the 3pt prescription.

$(x,Q^2)$-map and data-theory comparison

Legacy: [default],
New: [default w/ shifts], [default w/o shifts]

achiefa · 2024-12-10T11:29:03Z

Hi @RoyStegeman , I think I'm done here. Please, see the following comments:

In this dataset, uncertainties must be symmetrised. This means that the central data must be shifted accordingly. However, I suspect that the old implementation didn't have the data shifted. I produced two comparisons (see description), w/ and w/o shift, respectively. The one that does not implement the shift gives the same chi2 as the legacy version. You can also check by eye that the non-shifted data are closer to the legacy data.
I also believe that the old implementation rounded some data, and indeed the central data in HepData do not match the legacy implementation.
I can't reproduce the legacy covmat, but the cause might be one or both of the differences above. Below, you can find the two matrices.

Honestly, I can't judge whether these differences are relevant or not. The difference in chi2 is not negligible if one accounts for the shifts. On the other hand, the difference in the t0 matrices does not really worry me as I was able to reproduce the chi2 of the legacy implementation provided shifts were removed. @RoyStegeman, what do you think? Maybe it is worth asking @enocera.
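To make the role of the shift concrete, here is a minimal sketch of one commonly used symmetrisation prescription: the central value is moved by the half-sum of the signed upper and lower uncertainties, and the symmetric uncertainty is built from their half-difference and the shift. Whether the filter in this PR uses exactly this formula is an assumption, and the function name and numbers are hypothetical.

```python
import math


def symmetrise(delta_plus, delta_minus):
    """Symmetrise an asymmetric uncertainty.

    delta_plus (>= 0) and delta_minus (<= 0) are the signed upper and
    lower uncertainties. Returns (shift, sigma): the shift to add to the
    central value and the resulting symmetric uncertainty.
    Assumed prescription, not necessarily the one used in filter.py.
    """
    shift = (delta_plus + delta_minus) / 2.0
    average = (delta_plus - delta_minus) / 2.0
    sigma = math.sqrt(average**2 + 2.0 * shift**2)
    return shift, sigma


# Hypothetical data point with a +6 / -2 systematic uncertainty.
central = 100.0
shift, sigma = symmetrise(6.0, -2.0)
shifted_central = central + shift  # value used "w/ shifts"
unshifted_central = central        # value used "w/o shifts"
```

With a symmetric input the shift vanishes and the two comparisons coincide; the larger the asymmetry, the more the shifted central values (and hence the chi2) move away from the unshifted ones.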

RoyStegeman · 2024-12-10T13:07:55Z

Do you know why the fktables of this dataset only exist in theories 704 (0.5,0.5) and 705 (0.5,1)?

achiefa · 2024-12-10T13:09:09Z

> Do you know why the fktables of this dataset only exist in theories 704 (0.5,0.5) and 705 (0.5,1)?

No, maybe @enocera does.

RoyStegeman

I'm not so familiar with dataset implementations so this is going to take me some time to figure out...

For now I just have a question regarding the Extractor class. There are also a lot of unused imports, are you using an lsp?

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/filter.py

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/filter_utils.py

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/metadata.yaml

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/sys_uncertainties.py

achiefa · 2024-12-10T14:05:10Z

> There are also a lot of unused imports, are you using an lsp?

These were copied and pasted. My bad. BTW, what is an lsp?

RoyStegeman · 2024-12-10T14:06:59Z

Language Server Protocol. It's the software that highlights tokens based on their role in the Python syntax, including unused imports.

I'm pretty sure you are using it, but just in case you're not

achiefa · 2024-12-10T14:08:58Z

Oh, then I am. But I just forgot to delete the unused imports.

enocera · 2024-12-10T15:06:24Z

> > Do you know why the fktables of this dataset only exist in theories 704 (0.5,0.5) and 705 (0.5,1)?
>
> No, maybe @enocera does.

I think that the reason is as follows: W+c data were not included in NNPDF4.0 because, at that time, NNLO corrections to the matrix elements were not known. When the MHOU, QED, and aN3LO determinations were produced, the leitmotiv was to put them on the same footing as NNPDF4.0. Therefore W+c did not go into them. At some point I raised the question of whether we should include it (in the same way as we do, e.g., for LHC data in the N3LO fit). Initially their answer was yes, but then they retracted. So I suspect that Andrea started to compute the FK tables, but then stopped.

achiefa · 2024-12-12T15:08:49Z

According to what ERN said in the last code meeting, this one is also ready for review.

RoyStegeman

Thanks! The implementation looks great and was much easier to go through. I checked the data-th plots and chi2 and the changes between the old and new implementations seem acceptable

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/metadata.yaml

RoyStegeman · 2025-01-10T20:34:13Z

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/filter_utils.py

+        does not have any active role in the fit. For that reason, every bin has the
+        same value. Moreover, only the mid value is used.


I'm not sure what you mean to say here. The fact that MW2 is the same for all bins is because it's a fixed parameter, whether it's used in the fit or not. Perhaps the point here is that even though it's fixed you still need to put the value in each bin because it's treated as a kinematic variable and the code needs to take those from the kinematics.yaml file (with the exception of sqrts, as you mentioned in the other comment)?

What I mean here is that MW2 is used only in the computation of the $(x,Q^2)$-plane for the kinematic coverage. Contrary to the case of sqrts, MW2 is not deduced from the name of the dataset. So your point is right: although MW2 is a fixed parameter, the code still needs it.
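The point discussed above can be sketched as follows: every rapidity bin carries the same MW2 entry, and only its mid value is filled. The key names (`abs_y`, `m_W2`), the bin edges, and the W mass value are illustrative assumptions, not the contents of the actual kinematics file.

```python
# Illustrative W mass in GeV (assumption; the value used in the PR may differ).
M_W = 80.385
MW2 = M_W**2


def build_kinematics(edges):
    """Build one kinematics entry per bin from a list of bin edges.

    MW2 is a fixed parameter, so the same mid value is repeated in every
    bin; min and max are left empty because only mid is used.
    """
    bins = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        bins.append(
            {
                "abs_y": {"min": lo, "mid": 0.5 * (lo + hi), "max": hi},
                "m_W2": {"min": None, "mid": MW2, "max": None},
            }
        )
    return bins


# Hypothetical bin edges, for illustration only.
kin = build_kinematics([0.0, 0.4, 0.8, 1.3, 1.8, 2.4])
```

Repeating the constant in each bin is redundant from a physics point of view, but it lets code that reads per-bin kinematics treat MW2 like any other kinematic variable.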

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/filter_utils.py

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/metadata.yaml

RoyStegeman · 2025-01-12T15:15:47Z

nnpdf_data/nnpdf_data/commondata/CMS_WCHARM_13TEV/metadata.yaml

  variants:
    legacy:
      data_uncertainties:
      - uncertainties_legacy_WPWM-TOT-UNNORM.yaml
    legacy_10:
      data_uncertainties:
-      - uncertainties_WPWM-TOT-UNNORM_sys_10.yaml
-  data_central: data_legacy_WPWM-TOT-UNNORM.yaml
+      - uncertainties_legacy_WPWM-TOT-UNNORM_sys_10.yaml


Actually, one question: should the legacy variant not also include the old central values?

Good point. Yes, it should, given that it has changed. I'll restore it and add it to the metadata.

I've included the legacy central data in the variants. I've also updated the report in the description for the default legacy implementation.
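As a rough picture of why the legacy variant needs its own central values, the sketch below resolves a variant by overriding the default entries with whatever the variant specifies. This is a hypothetical helper, not the loader used by the code base; the two legacy file names come from the diff above, while the default file names are assumptions.

```python
def apply_variant(base, variants, name=None):
    """Return the file set for a dataset, with variant entries overriding
    the defaults. Hypothetical helper for illustration only."""
    resolved = dict(base)
    if name is not None:
        resolved.update(variants[name])
    return resolved


# Default entries: file names assumed for illustration.
base = {
    "data_central": "data_WPWM-TOT-UNNORM.yaml",
    "data_uncertainties": ["uncertainties_WPWM-TOT-UNNORM.yaml"],
}

# Legacy variant overriding both uncertainties and central values.
variants = {
    "legacy": {
        "data_central": "data_legacy_WPWM-TOT-UNNORM.yaml",
        "data_uncertainties": ["uncertainties_legacy_WPWM-TOT-UNNORM.yaml"],
    }
}

default_files = apply_variant(base, variants)
legacy_files = apply_variant(base, variants, "legacy")
```

If the variant listed only `data_uncertainties`, the legacy uncertainties would be paired with the new (shifted) central values, which is the mismatch the question above points at.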

RoyStegeman

Thanks, once the tests pass this can be merged

achiefa requested a review from RoyStegeman December 9, 2024 17:06

achiefa marked this pull request as draft December 9, 2024 17:06

achiefa mentioned this pull request Dec 9, 2024

Final revision of the 4.0 dataset #2242

Open

5 tasks

achiefa self-assigned this Dec 9, 2024

achiefa added the data toolchain label Dec 9, 2024

achiefa marked this pull request as ready for review December 10, 2024 11:43

RoyStegeman reviewed Dec 10, 2024

achiefa force-pushed the new_CMS_WCHARM_13TEV_WPWM-TOT-UNNORM branch from f054206 to 342e1f5 Compare December 12, 2024 15:07

RoyStegeman force-pushed the new_CMS_WCHARM_13TEV_WPWM-TOT-UNNORM branch from 5ce9c42 to 0eb48b5 Compare December 24, 2024 11:23

achiefa added regenerate-data and removed regenerate-data labels Dec 30, 2024

RoyStegeman force-pushed the new_CMS_WCHARM_13TEV_WPWM-TOT-UNNORM branch 3 times, most recently from 35eced4 to 3dd2bc2 Compare January 10, 2025 18:42

RoyStegeman approved these changes Jan 12, 2025

RoyStegeman requested changes Jan 12, 2025

achiefa force-pushed the new_CMS_WCHARM_13TEV_WPWM-TOT-UNNORM branch from f0d272d to e83aac3 Compare January 13, 2025 10:45

RoyStegeman approved these changes Jan 13, 2025

achiefa added 4 commits January 13, 2025 16:00

New implementation of CMS_WCHARM_13TEV_WPWM-TOT-UNNORM

486e6c8

Implementation in the new format - WIP

f19ed69

Correct metadata

db6cb12

Correct bug in filter

5a440b1

achiefa and others added 12 commits January 13, 2025 16:00

Correct percentage for lumi description

220300d

Comment out unused bin

64148c1

Remove sys_10 variant

06daacf

Correct bug in data generation

e74f01f

Correct order for shifts

4e6a503

Clean code + remove unused code

5e98fcf

Add docstring in sys_uncertainties.py

db7524f

Clean up filter files

f79bed3

run pre-commit

10575e9

add some comments to new data implementation

831345f

Correct docstring

b95aaf1

Add legacy data central to legacy variants

be779ee

RoyStegeman force-pushed the new_CMS_WCHARM_13TEV_WPWM-TOT-UNNORM branch from e83aac3 to be779ee Compare January 13, 2025 16:00

RoyStegeman merged commit c7308f3 into master Jan 13, 2025
7 checks passed

RoyStegeman deleted the new_CMS_WCHARM_13TEV_WPWM-TOT-UNNORM branch January 13, 2025 17:28
